A data set of Hirsch indices, h, for Finnish scientists in certain fields is statistically analyzed 
and fitted to $h(n) = P n^{p}$ for the n-th most-quoted scientist. The precoefficient P is characteristic 
of the field, and the exponent p is about -0.2 for all data sets considered. For Physics, Chemistry, 
and Chemical Engineering, the P values are 49.7(8), 41.3(6), and 21.4(6), respectively. These p values 
correspond to Pareto exponents of about -7 for the distribution of Hirsch indices h. 
I. INTRODUCTION 

The Hirsch index h [1, 2] provides a rough but robust 
measure of the total citation impact of an individual 
up to the time of observation. More precisely, a scientist 
has index h if he or she has h papers, each cited at least 
h times. Besides individuals, it can also be defined for 
universities, journals, etc. The values differ widely between 
fields, which raises the question of how to compare them across fields. 
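
The definition above can be made concrete with a minimal sketch. The function name `hirsch_index` and the citation counts are hypothetical illustrations, not part of the data set analyzed here.

```python
def hirsch_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most-cited paper first
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the first `rank` papers all have >= rank citations
        else:
            break
    return h


# Hypothetical citation counts for one author (illustrative only).
example = [25, 8, 5, 3, 3, 1, 0]
print(hirsch_index(example))
```

Sorting first means the loop can stop at the first paper whose citation count falls below its rank.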

We had available a small data set of h values 
in Chemistry, Physics, and Chemical Engineering for 
Finnish scientists. A statistical study reveals an interesting 
power-law distribution and hints at the relative 
weighting factors that may apply between different 
fields. 



II. METHOD AND RESULTS 

The data were determined from the ISI Web of Knowledge 
using the data set in General Search from 1945 onwards. 
This database only contains references in journals 
to papers in journals. Most data points were obtained in 
November 2005. The most-quoted one-third of the points 
in each area, k in number, were fitted using Gnuplot to 



$$h(n) = P n^{p} \qquad (1)$$



where h(n) is the h of the n-th most-quoted scientist, P 
is a precoefficient, and p is an exponent, found to be surprisingly 
constant between different fields. The obtained 
values are shown in Table I and the quality of the fits is 
demonstrated in Fig. 1. The figures in parentheses give 
the asymptotic standard error. In this data set, for the 
given country at the given time, the workers in different 
areas mostly share the same background and general 
working conditions, such as the typical research-group size 
and budget. Assuming that they are also equally gifted 
and hard-working, we suggest that the ratios of P 
between different fields could form a possible basis for 
comparing scientific merit between fields. 
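
As a rough illustration of such a fit, the sketch below estimates P and p by ordinary least squares on log h versus log n. This is a swapped-in technique: the fits reported here were made with Gnuplot, and a log-log regression weights the points differently, so it need not reproduce the tabulated values or their standard errors. The function name `fit_power_law` is hypothetical, and the input is synthetic data generated from the Physics parameters, not the actual Finnish data set.

```python
import math


def fit_power_law(h_values):
    """Fit h(n) = P * n**p to ranked h values (n = 1 for the highest h).

    Uses linear least squares on log h = log P + p * log n.
    Requires at least two strictly positive h values.
    """
    xs = [math.log(n) for n in range(1, len(h_values) + 1)]
    ys = [math.log(h) for h in h_values]
    k = len(xs)
    x_mean = sum(xs) / k
    y_mean = sum(ys) / k
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    p = sxy / sxx                          # slope = exponent p
    P = math.exp(y_mean - p * x_mean)      # intercept = log P
    return P, p


# Synthetic, noise-free h values for k = 14 ranks (assumption: generated
# from the Physics fit parameters, purely to exercise the function).
P_true, p_true = 49.7, -0.169
synthetic = [P_true * n ** p_true for n in range(1, 15)]

P_fit, p_fit = fit_power_law(synthetic)
print(round(P_fit, 1), round(p_fit, 3))
```

With real, noisy h values the recovered parameters would carry uncertainties, which this sketch does not estimate.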

Podlubny recently compared the total numbers of 
citations in various fields in the United States. He found 
FIG. 1: The fits (1) for Physics, Chemistry, Mathematics plus 
Computer Science, and Chemical Engineering. For the two 
latter fields the entire data sets are shown as points, although 
the fits only include the k highest points in Table I. 



TABLE I: The fits for certain areas; k is the number of points 
included in the fit. All data refer to Finland. 

Area            k    P           p
Medicine        4    90(3)       -0.22(3)
Bio/eco         5    59(4)       -0.23(7)
Physics         14   49.7(8)     -0.169(9)
Chemistry       17   41.3(6)     -0.173(7)
Math and Comp   8    23.8(1.5)   -0.22(5)
Chem. Eng.      5    21.4(6)     -0.25(3)


them to be fairly constant from 1992 to 2001 and suggested 
that they would form a useful normalization factor 
for comparing individual scientific performance between 
fields. 

In Table II we compare the present relative prefactors $P_{\mathrm{rel}}$ 
(with Physics normalized to 1) to the square roots of 
Podlubny's relative citation numbers. An average of his 
1992-2001 data is used. 
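
The relative prefactors follow directly from the fitted P values by dividing each by the Physics value. The short sketch below does this for the P values of Table I; the function name `relative_prefactors` is a hypothetical helper, and the uncertainties in parentheses are ignored.

```python
# Fitted prefactors P from Table I (Finland), uncertainties omitted.
P_VALUES = {
    "Medicine": 90.0,
    "Bio/eco": 59.0,
    "Physics": 49.7,
    "Chemistry": 41.3,
    "Math and Comp": 23.8,
    "Chem. Eng.": 21.4,
}


def relative_prefactors(prefactors, reference="Physics"):
    """Normalize every prefactor to the value of the reference field."""
    ref = prefactors[reference]
    return {area: value / ref for area, value in prefactors.items()}


for area, ratio in relative_prefactors(P_VALUES).items():
    print(f"{area:14s} {ratio:.2f}")
```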

Recall here that the lower limit for the total number 
TABLE II: The relative prefactors, $P_{\mathrm{rel}}$, with Physics normalized 
to one, and the square roots of the number of total 
citations, $(C_{\mathrm{rel}})^{1/2}$, with Physics normalized to one. 

Area            P_rel   (C_rel)^{1/2}
Medicine        1.8     2.0
Bio/Eco         1.2
Physics         1       1
Chemistry       0.83    0.88
Math and Comp   0.48    0.23
Chem. Eng.      0.43

of citations, $N_{c,\mathrm{tot}} = h^{2}$, and a typical number is 

$$N_{c,\mathrm{tot}} = a h^{2}, \qquad (2)$$

with a about 3-5. 

III. FURTHER DATA SETS 

A list of the h values for 40 'Dutch' chemists was published 
by Faas. Both people of Dutch origin working anywhere 
in the world and people of any origin working in The 
Netherlands were included. As seen from Fig. 2, all points 
are fitted well by the values P = 105.5(2.4), p = -0.212(11). 



FIG. 2: The h values of forty 'Dutch' chemists from Faas. 
The line is fitted to the points 1-20 and has p = -0.212(11). 



IV. RELATION TO THE PARETO DISTRIBUTION 

In economic theory, V. Pareto found in 1896 that the 
number of holders of income I in a country scales for 
high incomes as $I^{x}$, with x about -2. 
The same law was found by Zipf to hold for word frequencies 
in linguistics and by Lotka for numbers of papers 
among authors. It is known in many other fields, 
such as the size distributions of cities in a country, earthquakes, 
wars, etc. 

From eq. (1), 

$$n(h) = (h/P)^{1/p}. \qquad (3)$$

Introducing the density of individuals per unit of h, N(h), 

$$n = \int_{h}^{\infty} N(h')\,dh', \qquad (4)$$

we can interpret N(h) as the derivative 

$$N = -dn/dh. \qquad (5)$$

Then, using eq. (3), 

$$N(h) = -\frac{1}{p} P^{-1/p} h^{1/p - 1}. \qquad (6)$$

For the Finnish p values for Physics and Chemistry, the corresponding 
Pareto exponents x = 1/p - 1 would become -6.9 and -6.8, 
respectively. 
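
The arithmetic behind these exponents can be checked with a short sketch. It evaluates x = 1/p - 1, the power of h obtained by differentiating n(h) = (h/P)^{1/p}, and compares the resulting analytic density N(h) = -(1/p) P^{-1/p} h^{1/p - 1} with a central finite difference of -dn/dh. The function names and the test point h = 30 are arbitrary choices for illustration.

```python
def pareto_exponent(p):
    """Exponent x of the density N(h) ~ h**x implied by h(n) = P * n**p."""
    return 1.0 / p - 1.0


def rank_from_h(h, P, p):
    """Inverse of h(n) = P * n**p: the rank n at which the index equals h."""
    return (h / P) ** (1.0 / p)


def density(h, P, p):
    """Analytic N(h) = -dn/dh = -(1/p) * P**(-1/p) * h**(1/p - 1)."""
    return -(1.0 / p) * P ** (-1.0 / p) * h ** (1.0 / p - 1.0)


# Fitted (P, p) for Physics and Chemistry from the Finnish data.
for area, P, p in [("Physics", 49.7, -0.169), ("Chemistry", 41.3, -0.173)]:
    x = pareto_exponent(p)
    h, dh = 30.0, 1e-4  # arbitrary test point and step (assumption)
    numeric = -(rank_from_h(h + dh, P, p) - rank_from_h(h - dh, P, p)) / (2 * dh)
    print(area, round(x, 1), density(h, P, p), numeric)
```

Because p is negative, the factor -(1/p) is positive and the density comes out positive, as it must.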

The main conclusions are that the P value for Chemical 
Engineering is about half of that for Chemistry, and 
that the p values for the data sets considered are about 
-0.2. 